Abstract The q-gradient is an extension of the classical gradient vector based on the concept of Jackson's derivative. Here we introduce the q-gradient method for unconstrained global optimization. The main idea behind the method is the use of the negative of the q-gradient of the objective function as the search direction. In this sense, the proposed method is a generalization of the well-known steepest descent method. The use of Jackson's derivative has been shown to be an effective mechanism for escaping from local minima. The algorithm for the q-gradient method is complemented with strategies to generate the parameter q and to compute the step length in such a way that the search process gradually shifts from global at the beginning to almost local at the end. To test this new approach, we considered six commonly used test functions and compared our results with a deterministic global optimization solver and three Genetic Algorithms (GAs) considered effective in optimizing multidimensional unimodal and multimodal functions. For the multimodal test functions, the q-gradient method outperformed the solver and the GAs, reaching the minimum with better accuracy and with fewer function evaluations, respectively.

Keywords steepest descent method • Jackson's derivative • q-gradient • q-gradient method



1 Introduction 

Over the last decades the q-calculus has been connecting mathematics and physics in applications that span from quantum theory and statistical mechanics to number theory and combinatorics (see [1] and references therein). Its history dates back to the beginning of the last century when, based on pioneering works of Euler and Heine, the English reverend Frank Hilton Jackson developed the q-calculus in a systematic way [2]. His work gave rise to generalizations of series, functions and special numbers within the context of the q-calculus [3]. More importantly, he introduced the concepts of the q-derivative [4] (also known as Jackson's derivative) and the q-integral [5].
The q-derivative of a function f is defined as [6]

$$D_q f(x) = \frac{f(qx) - f(x)}{qx - x} \qquad (1)$$

where q is a real number different from 1. In the limiting case of $q \to 1$, the q-derivative reduces to the classical derivative.

Let $f(x) = x^n$, for example. In this case, the classical derivative of f is $n x^{n-1}$ and the q-derivative is $[n] x^{n-1}$, where [n] is the q-analogue of n given by

$$[n] = \frac{q^n - 1}{q - 1}.$$

As $q \to 1$, [n] tends to n. This definition is used to calculate the q-binomial and to establish a q-analogue of Taylor's formula that encompasses many results, such as Euler's identities for q-exponential functions, Gauss's q-binomial formula and Heine's formula for a q-hypergeometric function, among other mathematical results [1].
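As a quick numerical illustration of the example above, the following minimal sketch evaluates the q-derivative of $f(x) = x^n$ and compares it with $[n] x^{n-1}$ and with the classical derivative. The function names `q_derivative` and `q_number` are chosen here for illustration only and do not come from the original work.

```python
def q_derivative(f, x, q):
    """Jackson's q-derivative (f(qx) - f(x)) / (qx - x); needs q != 1 and x != 0."""
    return (f(q * x) - f(x)) / (q * x - x)


def q_number(n, q):
    """q-analogue [n] = (q**n - 1) / (q - 1) of the number n."""
    return (q ** n - 1) / (q - 1)


n, x = 3, 2.0
classical = n * x ** (n - 1)  # n * x^(n-1)
for q in (2.0, 1.5, 1.001):
    jackson = q_derivative(lambda t: t ** n, x, q)
    closed_form = q_number(n, q) * x ** (n - 1)  # [n] * x^(n-1)
    print(q, jackson, closed_form, classical)
```

The first two printed columns should agree for every q, and both should approach the classical value as q gets close to 1.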

Considering arbitrary functions f(x) and g(x), the q-derivative operator satisfies the following properties [1]:

1) The q-derivative is a linear operator for any constants a and b

$$D_q\big(a f(x) + b g(x)\big) = a D_q f(x) + b D_q g(x).$$

2) The q-derivative of the product of f(x) and g(x) is given by

$$D_q\big(f(x) g(x)\big) = f(qx) D_q g(x) + g(x) D_q f(x)$$

that, by symmetry, is equivalent to

$$D_q\big(f(x) g(x)\big) = f(x) D_q g(x) + g(qx) D_q f(x).$$

3) The q-derivative of the quotient of f(x) and g(x) is calculated as

$$D_q\left(\frac{f(x)}{g(x)}\right) = \frac{g(x) D_q f(x) - f(x) D_q g(x)}{g(x) g(qx)}$$

or equivalently

$$D_q\left(\frac{f(x)}{g(x)}\right) = \frac{g(qx) D_q f(x) - f(qx) D_q g(x)}{g(x) g(qx)}.$$
The chain rule for q-derivatives does not exist, except for a function of the form f(u(x)), where $u(x) = \alpha x^{\beta}$, with $\alpha, \beta$ being constants. More details on the properties of q-derivatives can be found in [7].
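The product and quotient rules for the q-derivative can be checked numerically. The sketch below is only an illustration: the choice of f = sin, g = exp and the values of x and q are arbitrary assumptions made here, and `q_derivative` is a helper name introduced for this example.

```python
import math


def q_derivative(f, x, q):
    """Jackson's q-derivative (f(qx) - f(x)) / (qx - x); needs q != 1 and x != 0."""
    return (f(q * x) - f(x)) / (q * x - x)


f, g = math.sin, math.exp  # arbitrary test functions; g is never zero
x, q = 0.7, 1.8            # arbitrary point and parameter

dqf = q_derivative(f, x, q)
dqg = q_derivative(g, x, q)

# Product rule, in its two equivalent forms.
product = q_derivative(lambda t: f(t) * g(t), x, q)
product_a = f(q * x) * dqg + g(x) * dqf
product_b = f(x) * dqg + g(q * x) * dqf

# Quotient rule, in its two equivalent forms.
quotient = q_derivative(lambda t: f(t) / g(t), x, q)
quotient_a = (g(x) * dqf - f(x) * dqg) / (g(x) * g(q * x))
quotient_b = (g(q * x) * dqf - f(q * x) * dqg) / (g(x) * g(q * x))

print(product, product_a, product_b)
print(quotient, quotient_a, quotient_b)
```

Within each printed line the three values should coincide up to floating-point rounding.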

Let a general nonlinear unconstrained optimization problem be defined as

$$\min F(\mathbf{x}), \quad \mathbf{x} = (x_1, \ldots, x_i, \ldots, x_n) \qquad (2)$$

where $\mathbf{x} \in \mathbb{R}^n$ is the vector of the independent variables and $F : \mathbb{R}^n \to \mathbb{R}$ is the objective function. The steepest descent method (also known as the gradient descent method) uses information on the gradient of the objective function in seeking the optimum. The search direction is given by the negative of the gradient of F. This search strategy is an obvious choice, since this is the direction along which the objective function decreases most rapidly.

Requiring only information about first derivatives, the steepest descent method is attractive because of its limited computational cost and storage requirements [8]. However, for multimodal functions, unless one knows in advance where to start from, the search procedure frequently gets stuck in one of the local minima. Consequently, the steepest descent method is not recommended for real-world optimization problems, which are usually multimodal. Nevertheless, because of its inherent simplicity, it represents a good starting point for the development of more advanced optimization methods.

Here we propose a generalization of the steepest descent method in which the gradient of the objective function is replaced by its q-analogue. Accordingly, the search direction is taken as the negative of the q-gradient of F. In the limit $q \to 1$, the method, here called the q-gradient method, reduces to the classical steepest descent method. In order to evaluate the performance of the q-gradient method we consider three unimodal and three multimodal test functions commonly used as benchmarks. We compare our results with those obtained with the Genetic Algorithms (GAs) G3-PCX, developed by Deb et al. [9], and SPC-vSBX and SPC-PNX, developed by Ballester and Carter [10], which previous studies have shown to be effective in minimizing multidimensional unimodal and multimodal functions. The best values generated by our approach are also compared with those given by the well-known LINDOGlobal deterministic solver available in NEOS Server for Optimization [11].

The rest of the paper is organized as follows. In Section 2 the q-gradient vector is defined and its properties are presented. In Section 3 the strategies to obtain the parameter q and the step length are described, and an algorithm for the q-gradient method is proposed. Section 4 describes the computational experiments and Section 5 discusses the results. Finally, in Section 6 some conclusions and future work are presented.

2 The q-Gradient 

Given a differentiable function of n variables $F(\mathbf{x})$, the gradient of F is the vector of the n first-order partial derivatives of F. Similarly, the q-gradient is the vector of the n first-order partial q-derivatives of F. Thus, letting the parameter q be a vector $\mathbf{q} = (q_1, \ldots, q_i, \ldots, q_n)$, where $q_i \neq 1\ \forall i$, the first-order partial q-derivative with respect to the variable $x_i$ is given by

$$D_{q_i,x_i} F(\mathbf{x}) = \frac{F(x_1, \ldots, q_i x_i, \ldots, x_n) - F(x_1, \ldots, x_i, \ldots, x_n)}{q_i x_i - x_i} \qquad (3)$$

with

$$\left. D_{q_i,x_i} F(\mathbf{x}) \right|_{x_i = 0} = \frac{\partial F(\mathbf{x})}{\partial x_i} \qquad (4)$$

and

$$\left. D_{q_i,x_i} F(\mathbf{x}) \right|_{q_i = 1} = \frac{\partial F(\mathbf{x})}{\partial x_i}. \qquad (5)$$

This framework can be extended to define the q-gradient of a function of n variables as

$$\nabla_q F(\mathbf{x}) = \left[ D_{q_1,x_1} F(\mathbf{x}) \ \ldots \ D_{q_i,x_i} F(\mathbf{x}) \ \ldots \ D_{q_n,x_n} F(\mathbf{x}) \right] \qquad (6)$$

with the classical gradient being recovered in the limit of $q_i \to 1$, for all $i = 1, \ldots, n$.
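The definition of the q-gradient can be sketched in a few lines of code. The function below is an illustrative assumption, not the authors' implementation: in the degenerate cases $x_i = 0$ or $q_i = 1$, where the definition falls back to the classical partial derivative, this sketch approximates that derivative with a central finite difference of step `h`, a choice made here only to keep the example self-contained.

```python
def q_gradient(F, x, q, h=1e-6):
    """Vector of first-order partial q-derivatives of F at the point x.

    F takes a list of floats; x and q are lists of equal length.
    Assumption of this sketch: if q_i == 1 or x_i == 0 the secant is
    degenerate, so a central finite difference stands in for the
    classical partial derivative.
    """
    fx = F(x)
    grad = []
    for i, (xi, qi) in enumerate(zip(x, q)):
        if qi == 1.0 or xi == 0.0:
            up, down = list(x), list(x)
            up[i], down[i] = xi + h, xi - h
            grad.append((F(up) - F(down)) / (2.0 * h))
        else:
            moved = list(x)
            moved[i] = qi * xi  # only the i-th coordinate is scaled by q_i
            grad.append((F(moved) - fx) / (qi * xi - xi))
    return grad


# Hypothetical quadratic objective used only for this illustration.
F = lambda x: x[0] ** 2 + 3.0 * x[1] ** 2

print(q_gradient(F, [1.0, 2.0], [2.0, 0.5]))  # partial q-derivatives
print(q_gradient(F, [1.0, 2.0], [1.0, 1.0]))  # approximately the classical gradient
```

For this quadratic the partial q-derivatives are $(q_1 + 1) x_1$ and $3 (q_2 + 1) x_2$, which makes the result easy to check by hand.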

Let f(x) be a function of one variable. In this case, the geometric interpretation of the gradient is simply the slope of the tangent line at x. Similarly, the q-gradient of f also has a straightforward geometric interpretation as the slope of the secant line passing through the points (x, f(x)) and (qx, f(qx)). If the slope of the secant line is positive (negative), the q-gradient points to the right (left). Fig. 1 illustrates this geometric interpretation for

$$f(x) = 2 - \left( e^{-x^2} + 2 e^{-(x-3)^2} \right). \qquad (7)$$

















Fig. 1 Geometric interpretation of the classical derivative (dotted line) and the q-derivative of f(x) at x = 1.0 and different values of the parameter q.



Since the slope of the tangent line (dotted line) at x = 1 is positive, the steepest descent method at this point will necessarily move to the left and, thus, will be trapped by the local minimum at x = 0. The slope of the secant line passing through the points (x, f(x)) and (qx, f(qx)) can be positive or negative depending on the value of the parameter q. For instance, if q = 2 the q-derivative is negative at x = 1 (see the secant line passing through (x, f(x)) and ($q_4 x$, f($q_4 x$)) in Fig. 1), which potentially allows a minimization strategy based on the value of the q-gradient to take a leap to the right, towards the global minimum of f. Note that there is a value $1.5 < q_3 < 2$ for which x = 1 is a stationary point of the q-gradient ($\nabla_q F(\mathbf{x}) = 0$) but that does not correspond to any minimum or maximum of f. Finally, for $0 < q_1 < 0.5$ or $1 < q_2 < 1.5$ the slope of the secant line is positive and the q-gradient method will move to the left, as the steepest descent method does.
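The sign changes described above can be reproduced directly from the definition of f in (7). The sketch below is illustrative only; the sample values of q are picked here from the intervals mentioned in the text, and `q_derivative` is a helper name introduced for this example.

```python
import math


def f(x):
    """Test function (7): local minimum near x = 0, global minimum near x = 3."""
    return 2.0 - (math.exp(-x ** 2) + 2.0 * math.exp(-(x - 3.0) ** 2))


def q_derivative(func, x, q):
    """Slope of the secant through (x, func(x)) and (qx, func(qx))."""
    return (func(q * x) - func(x)) / (q * x - x)


x = 1.0
for q in (0.25, 1.25, 1.5, 2.0):
    slope = q_derivative(f, x, q)
    move = "left" if slope > 0 else "right"
    print(f"q = {q}: secant slope = {slope:+.4f}, search moves {move}")
```

Only the largest sampled q yields a negative slope, so only that value sends the search to the right, towards the global minimum.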

The simple example above shows that the use of the q-gradient offers a new mechanism to escape from local minima. Moreover, the transition from global to local search might be controlled by the parameter q, provided a suitable strategy for generating q-values is incorporated into the minimization algorithm.

3 q-Gradient Method Description 

A general optimization strategy is to consider an iterative procedure that, starting from $\mathbf{x}^0$, generates a sequence $\{\mathbf{x}^k\}$ given by [12]

$$\mathbf{x}^{k+1} = \mathbf{x}^k + \alpha^k \mathbf{d}^k \qquad (8)$$

where k is the iteration number, $\mathbf{d}^k$ is the search direction and $\alpha^k$ is the step length, or the distance moved along $\mathbf{d}^k$ in iteration k.

Optimization methods can be characterized according to the direction and step length used in (8). The steepest descent method sets $\mathbf{d}^k = -\nabla F(\mathbf{x}^k)$, and the step length $\alpha^k$ is usually determined by a line-search technique that minimizes the objective function along the direction $\mathbf{d}^k$. In the q-gradient method, as proposed here, the search direction is the negative of the q-gradient of the objective function, $-\nabla_q F(\mathbf{x})$. Thus the optimization procedure defined by (8) becomes

$$\mathbf{x}^{k+1} = \mathbf{x}^k - \alpha^k \nabla_q F(\mathbf{x}^k). \qquad (9)$$

The strategies used to specify the parameter q and the step length $\alpha$ are key to the performance of the q-gradient method and are described below.

3.1 Parameter q 

Considering a function of n variables $F(\mathbf{x})$, a set of n different parameters $q_i \in \mathbb{R} - \{1\}$ ($i = 1, \ldots, n$) is needed to compute the q-gradient vector of F. The overall strategy adopted here is to draw the values of $q_i$ (or some variable related to them) from some suitable probability density function (pdf), with a standard deviation that decreases as the iterative search proceeds. In this sense, the role of the standard deviation here is reminiscent of the one played by the temperature in a simulated annealing (SA) algorithm, that is, to make the algorithm go from a very random search (at the beginning) to a very deterministic search (at the end).

In the current implementation we opted to first draw the values of $q_i^k x_i^k$ from a Gaussian pdf given by

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right] \qquad (10)$$

with $\mu = x_i^k$ and $\sigma = \sigma^k$; then, we computed the values of $q_i^k$.

Starting from $\sigma^0$, the standard deviation of the pdf is decreased by the following "cooling" schedule, $\sigma^{k+1} = \beta \cdot \sigma^k$, where $0 < \beta < 1$ is the reduction factor. As $\sigma^k$ approaches zero, the values of $q_i^k$ tend to unity, the algorithm reduces to the steepest descent method, and the search process becomes essentially local. As in an SA algorithm, the performance of the minimization algorithm depends crucially on the choice of the parameters $\sigma^0$ and $\beta$. A too rapid decrease of $\sigma^k$, for example, may cause the algorithm to be trapped in a local minimum.
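The sampling strategy for q and its cooling schedule can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `draw_q`, the sample point and the values of $\sigma^0$ and $\beta$ are hypothetical, and returning $q_i = 1$ when $x_i = 0$ is a convention chosen here because the text does not say how that case is handled.

```python
import random


def draw_q(x, sigma, rng):
    """Draw q_i * x_i from N(x_i, sigma^2) and recover q_i by dividing by x_i.

    Assumption of this sketch: q_i is set to 1 when x_i == 0, where the
    ratio is undefined.
    """
    q = []
    for xi in x:
        if xi == 0.0:
            q.append(1.0)
        else:
            q.append(rng.gauss(xi, sigma) / xi)
    return q


rng = random.Random(0)   # fixed seed so the run is repeatable
x = [-7.5, 2.0]          # hypothetical current point
sigma, beta = 5.0, 0.5   # hypothetical sigma^0 and reduction factor

for k in range(6):
    print(k, sigma, draw_q(x, sigma, rng))
    sigma *= beta        # cooling schedule: sigma^{k+1} = beta * sigma^k
```

As the standard deviation shrinks, the drawn values of $q_i$ concentrate around 1, which is the mechanism that turns the search from global into local.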



3.2 Step Length 

The calculation of the step length $\alpha$ is a tradeoff. On the one hand, $\alpha$ should give a considerable reduction of the objective function. On the other hand, its calculation should not take too many evaluations of F [13]. Steepest descent algorithms generally use line-search techniques to determine the step length $\alpha^k$ along the steepest descent direction $\mathbf{d}^k = -\nabla F(\mathbf{x}^k)$ at iteration k. A first version of our algorithm [14] applied the golden section method for step length determination. However, traditional line-search algorithms, like the golden section, ensure that the condition $F(\mathbf{x}^{k+1}) < F(\mathbf{x}^k)$ is always satisfied, which is a poor strategy when dealing with multimodal minimization problems, since escaping a local minimum requires accepting uphill moves. In addition, depending on the value of q, the negative of the q-gradient may not point in the local descent direction.

One way to circumvent these problems is to use a diminishing step length $\alpha^k$, i.e., the initial step length $\alpha^0$ is reduced by $\alpha^{k+1} = \beta \cdot \alpha^k$, where, for the sake of simplicity, $\beta$ is the same reduction factor used to compute $\sigma^k$. As the step length decreases (and the values of $q_i^k$, in parallel, tend to unity), a smooth transition to an increasingly local search process occurs.
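One consequence of the geometric schedule is worth making explicit. When the search direction has unit length, as in step c) of Algorithm 1, the distance travelled in iteration k equals $\alpha^k$, so the total path length is bounded by the geometric series $\alpha^0 / (1 - \beta)$. The sketch below illustrates this observation, which is a remark added here rather than a statement from the original text; the helper name `schedule` and the numerical values are hypothetical.

```python
def schedule(initial, beta, iterations):
    """Values initial * beta**k for k = 0, ..., iterations - 1."""
    values = []
    value = initial
    for _ in range(iterations):
        values.append(value)
        value *= beta
    return values


alpha0, beta = 1.0, 0.9              # hypothetical alpha^0 and reduction factor
steps = schedule(alpha0, beta, 200)

print(steps[:4])                     # first few step lengths
print(sum(steps))                    # distance that can be covered in 200 iterations
print(alpha0 / (1.0 - beta))         # limit of the geometric series
```

This suggests that $\alpha^0$ and $\beta$ need to be chosen so that the bound comfortably exceeds the distance from the starting point to the region of interest.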



3.3 Algorithm for the q-Gradient Method

Based on the definitions presented in the previous sections, the q-gradient method for continuous global optimization problems is described as follows.

Algorithm 1 (q-Gradient Method)

Given $\mathbf{x}^0$ (initial point), $\sigma^0 > 0$, $\alpha^0 > 0$ and $0 < \beta < 1$:

1) Set k = 0

2) Set $\mathbf{x}_{best} = \mathbf{x}^k$

3) While the stopping criteria are not reached, do:

a) Generate $\mathbf{q}^k \mathbf{x}^k$ by a Gaussian distribution with $\sigma^k$ and $\mu^k = \mathbf{x}^k$

b) Calculate the q-gradient $\nabla_q F(\mathbf{x}^k)$ by (6)

c) Set $\mathbf{d}^k = -\nabla_q F(\mathbf{x}^k) / \|\nabla_q F(\mathbf{x}^k)\|$

d) Set $\mathbf{x}^{k+1} = \mathbf{x}^k + \alpha^k \cdot \mathbf{d}^k$

e) If $F(\mathbf{x}^{k+1}) < F(\mathbf{x}_{best})$ set $\mathbf{x}_{best} = \mathbf{x}^{k+1}$

f) Set $\sigma^{k+1} = \beta \cdot \sigma^k$ and $\alpha^{k+1} = \beta \cdot \alpha^k$

g) Set k = k + 1

4) Return $\mathbf{x}_{best}$ and $F(\mathbf{x}_{best})$.
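The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function name, the fixed iteration budget used as the stopping criterion, the small offset applied when the drawn point coincides with $x_i$, and the demonstration parameters are all assumptions made here. The objective in the demonstration is the one-dimensional function (7).

```python
import math
import random


def q_gradient_method(F, x0, sigma0, alpha0, beta, max_iter=1000, seed=0):
    """Sketch of Algorithm 1; F takes a list of floats and returns a float."""
    rng = random.Random(seed)
    x = list(x0)
    sigma, alpha = sigma0, alpha0
    x_best, f_best = list(x), F(x)                     # step 2
    for _ in range(max_iter):                          # step 3 (iteration budget)
        fx = F(x)
        grad = []
        for i, xi in enumerate(x):
            qx_i = rng.gauss(xi, sigma)                # step a: draw q_i * x_i
            if qx_i == xi:
                # Assumption: nudge the point to avoid a zero denominator.
                qx_i = xi + 1e-8 * max(1.0, abs(xi))
            moved = list(x)
            moved[i] = qx_i
            grad.append((F(moved) - fx) / (qx_i - xi))  # step b: secant slope
        norm = math.sqrt(sum(g * g for g in grad))
        if norm > 0.0:                                 # steps c and d
            x = [xi - alpha * g / norm for xi, g in zip(x, grad)]
        f_new = F(x)
        if f_new < f_best:                             # step e
            x_best, f_best = list(x), f_new
        sigma *= beta                                  # step f
        alpha *= beta
    return x_best, f_best                              # step 4


def f7(x):
    """Function (7), wrapped to accept a one-element list."""
    return 2.0 - (math.exp(-x[0] ** 2) + 2.0 * math.exp(-(x[0] - 3.0) ** 2))


# Hypothetical parameter values, chosen only to exercise the sketch.
x_best, f_best = q_gradient_method(f7, [1.0], sigma0=2.0, alpha0=1.0,
                                   beta=0.95, max_iter=200, seed=1)
print(x_best, f_best)
```

Because the values of q are random, different seeds can end in different basins; the only guarantee of this sketch is that the returned value is never worse than the value at the starting point.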



The main idea behind the q-gradient method is to use the negative of the q-gradient of F, instead of the negative of the classical gradient of F, as the search direction. Strategies for generating the vector $\mathbf{q}^k$ and the step length $\alpha^k$, at each iteration, complement this very simple algorithm.

Basically, there are three free parameters to be specified, namely, $\sigma^0$, $\alpha^0$ and $\beta$. The initial standard deviation $\sigma^0$ determines how stochastic the search is. For multimodal functions, it must be high enough to allow the method to properly sample the search space. The reduction factor $\beta$ controls the speed of the transition from stochastic to deterministic search. A $\beta$ close to 1 reduces the risk of being trapped in a local minimum. The last free parameter, the initial step length $\alpha^0$, depends heavily on the topology of the search space and, thus, requires some empirical exploration. In the end, as with the choice of the cooling schedule in an SA algorithm [15], an appropriate specification of the three free parameters is strictly dependent on the objective function. Although a bad choice may lead to some deterioration in its performance, the q-gradient method has proven sufficiently robust to still be capable of reaching the global minimum.

The algorithm stops when the appropriate stopping criterion is attained. In real-world applications (i.e., in problems for which the global minimum is not known), it can be the maximum number of function evaluations, or the value of the local gradient $\|\nabla F(\mathbf{x}^k)\| < \epsilon$, since the q-gradient method converges to the steepest descent method at the end of the search. The algorithm returns $\mathbf{x}_{best}$ as the point with the minimum value of the objective function F obtained during the iterative procedure, i.e., $F(\mathbf{x}_{best}) \le F(\mathbf{x}^k)$, $\forall k$.



4 Computational Experiments 

The performance of the q-gradient method was evaluated over six 20-variable test functions (n = 20) commonly employed in the literature. We use the same experimental setup and stopping criteria as described in [9] and [10] in order to allow a direct comparison with their results. The stopping criteria are: a maximum of $10^6$ function evaluations or $F(\mathbf{x}) < 10^{-20}$.
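Both stopping criteria depend on bookkeeping that is easy to get wrong, namely counting every call to the objective and remembering the best value seen. The sketch below shows one way to do this; the class and function names are hypothetical, and the sphere function is used only as a stand-in objective.

```python
import math


class CountedObjective:
    """Wraps an objective, counting evaluations and tracking the best value."""

    def __init__(self, F):
        self.F = F
        self.evaluations = 0
        self.best_value = math.inf

    def __call__(self, x):
        self.evaluations += 1
        value = self.F(x)
        self.best_value = min(self.best_value, value)
        return value


def should_stop(objective, max_evaluations=10 ** 6, target=1e-20):
    """True once the evaluation budget is spent or the target accuracy is met."""
    return (objective.evaluations >= max_evaluations
            or objective.best_value < target)


# Stand-in objective for illustration only.
sphere = CountedObjective(lambda x: sum(v * v for v in x))

sphere([1.0, 2.0])
print(sphere.evaluations, should_stop(sphere))  # 1 False
sphere([0.0, 0.0])
print(sphere.evaluations, should_stop(sphere))  # 2 True
```

Wrapping the objective in this way keeps the counting independent of the optimizer, so the same budget can be applied to any method being compared.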
As in [9] and [10], we set the three free parameters of the q-gradient algorithm after preliminary exploratory runs. The results presented here are for the settings that yielded the best performance. The benchmark consists of the following analytical functions:

1) Ellipsoidal function ($F_{elp}$)

$$F_{elp} = \sum_{i=1}^{n} i\, x_i^2 \qquad (11)$$

Although the Ellipsoidal function is convex and unimodal, it is an example of a poorly scaled function.

2) Schwefel's function ($F_{sch}$)

$$F_{sch} = \sum_{i=1}^{n} \left( \sum_{j=1}^{i} x_j \right)^2 \qquad (12)$$

Schwefel's function is an extension of the Ellipsoidal function and it is also a unimodal and poorly scaled function.

3) Generalized Rosenbrock's function ($F_{ros}$)

$$F_{ros} = \sum_{i=1}^{n-1} \left[ 100 \cdot (x_i^2 - x_{i+1})^2 + (1 - x_i)^2 \right]. \qquad (13)$$

Although Rosenbrock's function is a well-known unimodal function for n = 2, numerical experiments have shown that for $4 \le n \le 30$ the function has two minima, the global one at $\mathbf{x} = \mathbf{1}$ and a local minimum that changes with the dimensionality n [16]. Rosenbrock's function is considered a test case for premature convergence, since the global minimum lies inside a long, narrow, parabolic-shaped flat valley.

4) Ackley's function ($F_{ackl}$)

$$F_{ackl} = 20 + e - 20 \exp\left( -0.2 \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} \right) - \exp\left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi x_i) \right) \qquad (14)$$

Ackley's function is highly multimodal and the basins of the local minima increase in size as one moves away from the global minimum [10].

5) Rastrigin's function ($F_{rtg}$)

$$F_{rtg} = 10n + \sum_{i=1}^{n} \left( x_i^2 - 10 \cos(2\pi x_i) \right). \qquad (15)$$

Rastrigin's function has a parabolic landscape away from the global minimum, but as we move towards the global minimum, the size of the basins increases. The function is highly multimodal, and these characteristics are known to make it difficult for many optimization algorithms to reach the global minimum [10].

6) Rotated Rastrigin's function ($F_{rrtg}$)

$$F_{rrtg} = 10n + \sum_{i=1}^{n} \left( y_i^2 - 10 \cos(2\pi y_i) \right), \quad \mathbf{y} = A \cdot \mathbf{x}, \qquad (16)$$

with

$$A_{i,i} = 4/5, \quad A_{i,i+1} = 3/5 \ (i \text{ odd}), \quad A_{i,i-1} = -3/5 \ (i \text{ even}), \quad A_{i,k} = 0 \ (\text{otherwise}).$$

The rotated Rastrigin's function is a highly multimodal function without local minima arranged along the axes [10].

For all these functions the global minimum is $F(\mathbf{x}^*) = 0$ at $\mathbf{x}^* = \mathbf{0}$, except for the Generalized Rosenbrock's function, where $\mathbf{x}^* = \mathbf{1}$. The initial point set $\mathbf{x}^0$ for each function is generated by a uniform distribution within [-10, -5], as used in [9] and [10].

5 Results

5.1 Comparison with GAs

Extensive comparisons between the GAs G3-PCX ([9] for Ellipsoidal, Schwefel, Rosenbrock and Rastrigin functions; [10] for Ackley and Rotated Rastrigin), SPC-vSBX and SPC-PNX [10] and the q-gradient method are presented in Tables 2 and 3. As in [9] and [10], the best, median and worst columns refer to the number of function evaluations required to reach the accuracy $10^{-20}$. When this condition is not achieved, the best value found so far for the test function after $10^6$ evaluations is given. The column "Success" refers to how many runs reached the target accuracy, for unimodal functions, or ended up within the global minimum basin, for multimodal ones. The best performances are highlighted in bold in each table. The corresponding values of the best parameters $\sigma^0$, $\alpha^0$ and $\beta$ used in each test function are given in Table 1.

In Table 2, for the Ellipsoidal function, the q-gradient method achieved the required accuracy $10^{-20}$ for all 50 runs, with an overall performance similar to the one displayed by the G3-PCX, the best algorithm among the GAs. As for Schwefel's function, the q-gradient method again attained the required accuracy for all runs but was outperformed by the G3-PCX in terms of the number of function evaluations. Finally, for Rosenbrock's function,
Table 1 Parameters used by the q-gradient method over the test functions.

Functions            $\sigma^0$   $\alpha^0$   $\beta$
Ellipsoidal          0.4          38           0.86
Schwefel             0.1          1            0.997
Rosenbrock           0.1          0.1          0.9995
Ackley               20           12           0.90
Rastrigin            21           0.3          0.9995
Rotated Rastrigin    30           0.5          0.999



Table 2 Performance comparison between G3-PCX, SPC-vSBX, SPC-PNX and the q-gradient method over the unimodal test functions in terms of the best, median and worst number of function evaluations required to reach the accuracy $10^{-20}$. The column "Success" refers to how many runs reached the target accuracy.

Function      Method       Best       Median     Worst      $F_{best}$    Success
Ellipsoidal   G3-PCX       5,826      6,800      7,728      $10^{-20}$    10/10
              SPC-vSBX     49,084     50,952     57,479     $10^{-20}$    10/10
              SPC-PNX      36,360     39,360     40,905     $10^{-20}$    10/10
              q-Gradient   5,905      7,053      7,381      $10^{-20}$    50/50
Schwefel      G3-PCX       13,988     15,602     17,188     $10^{-20}$    10/10
              SPC-vSBX     260,442    294,231    334,743    $10^{-20}$    10/10
              SPC-PNX      236,342    283,321    299,301    $10^{-20}$    10/10
              q-Gradient   289,174    296,103    299,178    $10^{-20}$    50/50
Rosenbrock    G3-PCX       16,508     21,452     25,520     $10^{-20}$    36/50
              SPC-vSBX     $10^6$     -          -          $10^{-4}$     48/50
              SPC-PNX      $10^6$     -          -          $10^{-10}$    38/50
              q-Gradient   $10^6$     -          -          $10^{-10}$    50/50
the q-gradient was beaten by the G3-PCX (the only one to achieve the required accuracy) but performed better than the two other GAs.

The overall evaluation of the q-gradient method performance in these numerical experiments with unimodal (or quasi-unimodal) test functions indicates that it reaches the required accuracy (or the global minimum basin) in 100% of the runs, but it is not faster than the G3-PCX. This picture improves considerably when it comes to tackling the multimodal Ackley's and Rastrigin's functions.

In Table 3, due to limited computing precision, the required accuracy for Ackley's function was set to $10^{-10}$ for the GAs and $10^{-15}$ in our simulations¹. The q-gradient method was here clearly better than the GAs, reaching the required accuracy in more runs or in fewer function evaluations. For the

1 Numerical experiments have shown that Ackley's function evaluated at x = 0 with double precision is equal to -0.4440892098500626E-015 and not zero.
Table 3 Performance comparison between G3-PCX, SPC-vSBX, SPC-PNX and the q-gradient method over the multimodal test functions in terms of the best, median and worst number of function evaluations required to reach the accuracy $10^{-20}$. The column "Success" refers to how many runs reached the target accuracy.

Function            Method       Best       Median     Worst      $F_{best}$    Success
Ackley              G3-PCX       $10^6$     -          -          3.959         -
                    SPC-vSBX     57,463     63,899     65,902     $10^{-10}$    10/10
                    SPC-PNX      45,736     48,095     49,392     $10^{-10}$    10/10
                    q-Gradient   11,850     12,465     13,039     $10^{-15}$    50/50
Rastrigin           G3-PCX       $10^6$     -          -          15.936        -
                    SPC-vSBX     260,685    306,819    418,482    $10^{-20}$    6/10
                    SPC-PNX      $10^6$     -          -          4.975         -
                    q-Gradient   676,050    692,450    705,037    $10^{-20}$    48/50
Rotated Rastrigin   G3-PCX       $10^6$     -          -          309.429       -
                    SPC-vSBX     $10^6$     -          -          8.955         -
                    SPC-PNX      $10^6$     -          -          3.980         -
                    q-Gradient   541,857    545,957    549,114    $10^{-20}$    20/50



Rastrigin's function, the G3-PCX and the SPC-PNX were unable to attain the global minimum basin. The other two algorithms reached the required accuracy $10^{-20}$, but the q-gradient method was the only one to do so in 96% of the runs (48 out of 50). Finally, in the case of the rotated Rastrigin's function, the q-gradient method was the only algorithm to reach the minimum, attaining the required accuracy in 20 out of 50 independent runs. Summarizing the results with multimodal functions, we may say that the q-gradient method outperformed the GAs in all three test cases considered, reaching the minimum with fewer function evaluations or in more independent runs.



5.2 Comparison with Deterministic Solver 

In order to compare our approach with deterministic global optimization, the test functions were implemented in GAMS format [17] and solved using the LINDOGlobal solver through the version freely available on NEOS Server for Optimization [11]. The optimization procedure of the LINDOGlobal solver employs branch-and-cut methods to break a nonlinear model down into a list of subproblems. In addition, the solver has a multistart feature that restarts the standard (non-global) nonlinear solver from a number of intelligently generated points, which allows LINDOGlobal to find a number of locally optimal points and report the best one found [18]. This feature allows the solver to be applied to multimodal functions. Once an optimization problem is submitted to NEOS, all additional information required by the optimization solver is determined automatically.
The LINDOGlobal output file does not provide any information about the number of function evaluations, so we only compared the optimal values given by the solver with those generated by our approach. For the q-gradient method we considered the parameters $\sigma^0$, $\alpha^0$ and $\beta$ defined in Table 1. The stopping criteria are now the value of the local gradient $\|\nabla F(\mathbf{x}^k)\| < 10^{-20}$, or the maximum of $10^6$ function evaluations. For LINDOGlobal, we implemented the test functions with the standard options and the same initial point set as in the q-gradient method. The best value for each method was taken among the 50 independent runs.

Results are shown in Table 4. For the unimodal functions, the deterministic solver, as expected, displays a better performance. However, for the multimodal functions, the q-gradient method attained a better accuracy and outperformed the LINDOGlobal solver.



Table 4 The best solution found by the GAMS/LINDOGlobal solver and the q-gradient method over the test functions.

Functions

Ellipsoidal
Schwefel
Rosenbrock

q-gradient

LINDOGlobal

4.4492E-28
9.1939E-11

Ackley -4.4409E
Rastrigin
Rotated Rastrigin
-6.0396E-14
-1.3358E-12
-1.2506E-12



6 Conclusions and Future Work 

In this work we introduced the q-gradient method for global optimization. The main idea behind the method is the use of the negative of the q-gradient of the objective function (a generalization of the classical gradient based on Jackson's derivative) as the search direction. The use of Jackson's derivative provides an effective mechanism for escaping from local minima. The algorithm is implemented in such a way that the search process gradually shifts from global at the beginning to almost local at the end.

To test this new approach, we considered six commonly used 20-variable test functions. These functions display features of real-world optimization problems (multimodality, for example) and are notoriously difficult for optimization algorithms to handle. We compared the q-gradient method with GAs developed by Deb et al. [9] and by Ballester and Carter [10], with promising results. Overall, the q-gradient method clearly beat the competition in the hardest test cases, those dealing with the multimodal functions.
The (relatively) poor results of the q-gradient method with Rosenbrock's function, a unimodal test function that is especially difficult for the steepest descent method, come as no surprise. This result highlights the need for the development of a q-generalization of the well-known conjugate-gradient method, a research line currently being explored.